[opt](parquet-reader)Implement late materialization of parquet complex types. #44098

Merged


Contributor

@kaka11chen kaka11chen commented Nov 17, 2024

What problem does this PR solve?

Problem Summary:
The Parquet reader does not support late materialization when a query reads fields of complex types.
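
To make the problem concrete, here is a minimal sketch of late materialization applied to a list-typed column. It is an illustration only: `ListColumn`, `eval_predicate`, and `materialize_late` are hypothetical names and do not correspond to Doris classes. The idea is to evaluate the predicate on a cheap scalar column first, then decode the nested column only for the rows that survive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory stand-in for a list-typed column (not a Doris type).
struct ListColumn {
    std::vector<std::vector<int32_t>> rows; // one list per row
};

// Step 1: evaluate the predicate on the scalar predicate column only.
// Returns a row-level filter: 1 = keep the row, 0 = drop it.
std::vector<uint8_t> eval_predicate(const std::vector<int32_t>& ids, int32_t min_id) {
    std::vector<uint8_t> keep(ids.size());
    for (size_t i = 0; i < ids.size(); ++i) {
        keep[i] = ids[i] >= min_id ? 1 : 0;
    }
    return keep;
}

// Step 2: materialize the nested column only for rows that passed the filter.
// `decoded_values` reports how many list elements were actually copied, which
// is the work that late materialization saves on filtered rows.
ListColumn materialize_late(const ListColumn& source, const std::vector<uint8_t>& keep,
                            size_t* decoded_values) {
    assert(keep.size() == source.rows.size());
    ListColumn out;
    *decoded_values = 0;
    for (size_t i = 0; i < source.rows.size(); ++i) {
        if (keep[i] == 0) {
            continue; // filtered row: its elements are never touched
        }
        out.rows.push_back(source.rows[i]);
        *decoded_values += source.rows[i].size();
    }
    return out;
}
```

In a real Parquet reader the nested values are stored flattened with repetition and definition levels, so the row-level filter cannot be applied directly to the value stream; that gap is what the release note below addresses.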

Release note

  • Separation of Filter Logic
    Split ColumnSelectVector into two components: FilterMap and ColumnSelectVector. FilterMap becomes stateless, with filter_map_index now tracked by the individual ColumnReaders. This separation improves encapsulation and makes filter management more flexible.
  • Complex Type Processing Enhancement
    During reads, new nested filter maps are generated dynamically. A dual tracking mechanism is implemented:
    filter_map_index tracks the current filter map, while _orig_filter_map_index preserves tracking of the original filter map.
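
The two points above can be sketched roughly as follows. This is not the code from this PR: `RowFilter` and `generate_nested_filter` are simplified, hypothetical stand-ins, and the sketch assumes every batch of repetition levels starts on a row boundary. It shows a stateless row-level filter being expanded into a value-level (nested) filter using Parquet repetition levels, with the read position owned by the caller rather than by the filter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stateless row-level filter: 1 = keep the row, 0 = drop it.
// (Simplified stand-in; not the real Doris FilterMap.)
struct RowFilter {
    std::vector<uint8_t> keep;
};

// Expand a row-level filter into one entry per nested value.
// In Parquet, rep_level == 0 marks the first value of a new top-level row.
// `row_cursor` is owned by the caller (the column reader in this sketch), so
// the filter holds no position and can be shared across readers.
// Assumption: the batch begins with rep_level == 0.
std::vector<uint8_t> generate_nested_filter(const RowFilter& filter,
                                            const std::vector<uint16_t>& rep_levels,
                                            size_t* row_cursor) {
    std::vector<uint8_t> nested;
    nested.reserve(rep_levels.size());
    size_t row = *row_cursor;
    bool started = false;
    for (uint16_t rep : rep_levels) {
        if (rep == 0) {
            if (started) {
                ++row; // a new row begins; move to its filter entry
            }
            started = true;
        }
        assert(started && row < filter.keep.size());
        nested.push_back(filter.keep[row]); // every value inherits its row's decision
    }
    if (started) {
        *row_cursor = row + 1; // advance past the rows consumed by this batch
    }
    return nested;
}
```

Because the cursor lives outside the filter, two readers can walk the same `RowFilter` at different speeds, which is the kind of flexibility the release note attributes to making FilterMap stateless. How the PR maps this onto filter_map_index and _orig_filter_map_index is not shown here.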

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaka11chen
Contributor Author

run buildall

Contributor

@github-actions github-actions bot left a comment


clang-tidy made some suggestions

@@ -316,9 +318,15 @@ Status ScalarColumnReader::_read_values(size_t num_values, ColumnPtr& doris_colu
* whether the reader should read the remaining value of the last row in previous page.
*/
Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,

warning: function '_read_nested_column' has cognitive complexity of 77 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:330: +1, including nesting penalty of 0, nesting level increased to 1

    if (align_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:335: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:344: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_rep_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:346: +2, including nesting penalty of 1, nesting level increased to 2

        while (parsed_rows <= batch_size && remaining_values > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:346: +1

        while (parsed_rows <= batch_size && remaining_values > 0) {
                                         ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:348: +3, including nesting penalty of 2, nesting level increased to 3

            if (rep_level == 0) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:349: +4, including nesting penalty of 3, nesting level increased to 4

                if (parsed_rows == batch_size) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:359: +2, including nesting penalty of 1, nesting level increased to 2

        if (filter_map.has_filter()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:363: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:363: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:368: +1, nesting level increased to 1

    } else if (!align_rows) {
           ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:378: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_def_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:380: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:387: +1, including nesting penalty of 0, nesting level increased to 1

    if (doris_column->is_nullable()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:393: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:394: +2, including nesting penalty of 1, nesting level increased to 2

        if (_field_schema->is_nullable) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:404: +1, including nesting penalty of 0, nesting level increased to 1

    while (has_read < origin_size + parsed_values) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:407: +2, including nesting penalty of 1, nesting level increased to 2

        while (has_read < origin_size + parsed_values && _def_levels[has_read] == def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:411: +2, including nesting penalty of 1, nesting level increased to 2

        if (def_level < _field_schema->repeated_parent_def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:418: +2, including nesting penalty of 1, nesting level increased to 2

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:418: +1

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
                                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:422: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:423: +3, including nesting penalty of 2, nesting level increased to 3

            if (!(prev_is_null ^ is_null)) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:427: +3, including nesting penalty of 2, nesting level increased to 3

            while (remaining > USHRT_MAX) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:442: +1, including nesting penalty of 0, nesting level increased to 1

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:442: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:447: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_chunk_reader->decode_values(data_column, type, select_vector, is_dict_filter));
    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:447: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_chunk_reader->decode_values(data_column, type, select_vector, is_dict_filter));
    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:448: +1, including nesting penalty of 0, nesting level increased to 1

    if (ancestor_nulls != 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:449: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:449: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:452: +1, including nesting penalty of 0, nesting level increased to 1

    if (!align_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:456: +1, including nesting penalty of 0, nesting level increased to 1

    if (_chunk_reader->remaining_num_values() == 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:457: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->has_next_page()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:458: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:458: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:459: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:459: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:462: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:467: +1, including nesting penalty of 0, nesting level increased to 1

    if (current_filter_map->has_filter()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:488: +1, including nesting penalty of 0, nesting level increased to 1

    if (_rep_levels.size() > 0) {
    ^

@@ -476,7 +530,7 @@
}

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,

warning: function 'read_column_data' has cognitive complexity of 81 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:534: +1, including nesting penalty of 0, nesting level increased to 1

    if (_converter == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:537: +2, including nesting penalty of 1, nesting level increased to 2

        if (!_converter->support()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:546: +1, including nesting penalty of 0, nesting level increased to 1

    do {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:547: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:548: +3, including nesting penalty of 2, nesting level increased to 3

            if (!_chunk_reader->has_next_page()) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:553: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:553: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:555: +2, including nesting penalty of 1, nesting level increased to 2

        if (_nested_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:556: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:556: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:557: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:557: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:567: +2, including nesting penalty of 1, nesting level increased to 2

        if (read_ranges.size() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:570: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:570: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:572: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:576: +3, including nesting penalty of 2, nesting level increased to 3

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:576: +1

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
                                        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:582: +4, including nesting penalty of 3, nesting level increased to 4

                if (batch_size >= remaining_num_values &&
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:582: +1

                if (batch_size >= remaining_num_values &&
                                                       ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:587: +5, including nesting penalty of 4, nesting level increased to 5

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:587: +6, including nesting penalty of 5, nesting level increased to 6

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:589: +5, including nesting penalty of 4, nesting level increased to 5

                    if (!_chunk_reader->has_next_page()) {
                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:594: +1

                skip_whole_batch = batch_size <= remaining_num_values &&
                                                                      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:596: +4, including nesting penalty of 3, nesting level increased to 4

                if (skip_whole_batch) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:601: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:601: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:626: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0 && !_chunk_reader->has_next_page()) {
        ^

@@ -445,7 +446,7 @@ Status RowGroupReader::_read_column_data(Block* block, const std::vector<std::st

Status RowGroupReader::_do_lazy_read(Block* block, size_t batch_size, size_t* read_rows,

warning: function '_do_lazy_read' has cognitive complexity of 109 (threshold 50) [readability-function-cognitive-complexity]

Status RowGroupReader::_do_lazy_read(Block* block, size_t batch_size, size_t* read_rows,
                       ^
Additional context

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:454: +1, including nesting penalty of 0, nesting level increased to 1

    for (uint32_t i = 0; i < origin_column_num; ++i) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:459: +1, including nesting penalty of 0, nesting level increased to 1

    while (!_state->is_cancelled()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:464: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.predicate_columns.first, batch_size,
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:464: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.predicate_columns.first, batch_size,
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:466: +2, including nesting penalty of 1, nesting level increased to 2

        if (pre_read_rows == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:471: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_fill_partition_columns(block, pre_read_rows,
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:471: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_fill_partition_columns(block, pre_read_rows,
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:473: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_fill_missing_columns(block, pre_read_rows,
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:473: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_fill_missing_columns(block, pre_read_rows,
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:476: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_build_pos_delete_filter(pre_read_rows));
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:476: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_build_pos_delete_filter(pre_read_rows));
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:479: +2, including nesting penalty of 1, nesting level increased to 2

        if (_lazy_read_ctx.resize_first_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:487: +2, including nesting penalty of 1, nesting level increased to 2

        if (_position_delete_ctx.has_filter) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:498: +2, including nesting penalty of 1, nesting level increased to 2

            RETURN_IF_ERROR(VExprContext::execute_conjuncts(filter_contexts, &filters, block,
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:498: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(VExprContext::execute_conjuncts(filter_contexts, &filters, block,
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:502: +2, including nesting penalty of 1, nesting level increased to 2

        if (_lazy_read_ctx.resize_first_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:509: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(filter_map_ptr->init(filter_map_data, pre_read_rows, can_filter_all));
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:509: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(filter_map_ptr->init(filter_map_data, pre_read_rows, can_filter_all));
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:510: +2, including nesting penalty of 1, nesting level increased to 2

        if (filter_map_ptr->filter_all()) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:523: +3, including nesting penalty of 2, nesting level increased to 3

            if (!pre_eof) {
            ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:524: +4, including nesting penalty of 3, nesting level increased to 4

                if (pre_raw_read_rows >= config::doris_scanner_row_num) {
                ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:529: +1, nesting level increased to 3

            } else { // pre_eof
              ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:537: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:541: +1, including nesting penalty of 0, nesting level increased to 1

    if (_state->is_cancelled()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:545: +1, including nesting penalty of 0, nesting level increased to 1

    if (filter_map_ptr == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:554: +1, including nesting penalty of 0, nesting level increased to 1

    if (_cached_filtered_rows != 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:555: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_rebuild_filter_map(filter_map, rebuild_filter_map, pre_read_rows));
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:555: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_rebuild_filter_map(filter_map, rebuild_filter_map, pre_read_rows));
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:563: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.lazy_read_columns, pre_read_rows,
    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:563: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.lazy_read_columns, pre_read_rows,
    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:566: +1, including nesting penalty of 0, nesting level increased to 1

    if (pre_read_rows != lazy_read_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:573: +1, including nesting penalty of 0, nesting level increased to 1

    if (filter_map.has_filter()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:574: +2, including nesting penalty of 1, nesting level increased to 2

        if (block->columns() == origin_column_num) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:578: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:579: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:79: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

    do {                                                                                         \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:579: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:84: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

        } catch (const doris::Exception& e) {                                                    \
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:579: +5, including nesting penalty of 4, nesting level increased to 5

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:85: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

            if (e.code() == doris::ErrorCode::MEM_ALLOC_FAILED) {                                \
            ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:583: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:591: +1, including nesting penalty of 0, nesting level increased to 1

    for (int i = 0; i < column_num; ++i) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:593: +2, including nesting penalty of 1, nesting level increased to 2

        if (column_size != 0 && cz != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:593: +1

        if (column_size != 0 && cz != 0) {
                             ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:596: +2, including nesting penalty of 1, nesting level increased to 2

        if (cz != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:604: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_fill_partition_columns(block, column_size, _lazy_read_ctx.partition_columns));
    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:604: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_fill_partition_columns(block, column_size, _lazy_read_ctx.partition_columns));
    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:605: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_fill_missing_columns(block, column_size, _lazy_read_ctx.missing_columns));
    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:605: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_fill_missing_columns(block, column_size, _lazy_read_ctx.missing_columns));
    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:606: +1, including nesting penalty of 0, nesting level increased to 1

    if (!_not_single_slot_filter_conjuncts.empty()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:609: +2, including nesting penalty of 1, nesting level increased to 2

            RETURN_IF_CATCH_EXCEPTION(
            ^

be/src/common/exception.h:79: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

    do {                                                                                         \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:610: +3, including nesting penalty of 2, nesting level increased to 3

                    RETURN_IF_ERROR(VExprContext::execute_conjuncts_and_filter_block(
                    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:610: +4, including nesting penalty of 3, nesting level increased to 4

                    RETURN_IF_ERROR(VExprContext::execute_conjuncts_and_filter_block(
                    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:609: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_CATCH_EXCEPTION(
            ^

be/src/common/exception.h:84: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

        } catch (const doris::Exception& e) {                                                    \
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:609: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_CATCH_EXCEPTION(
            ^

be/src/common/exception.h:85: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

            if (e.code() == doris::ErrorCode::MEM_ALLOC_FAILED) {                                \
            ^
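
The increments in the log above follow a visible pattern: each `if`, loop, or `catch` adds 1 plus its current nesting level, an `else` adds a flat 1, and a run of `&&` adds 1. As a minimal sketch of why flattening helps (the function names here are hypothetical and unrelated to the Doris code), moving a nested condition into a helper removes the deepest nesting penalty:

```cpp
#include <vector>

// Nested form. By the accounting shown in the log this should score
// +1 (for, nesting 0), +2 (outer if, nesting 1), +3 (inner if, nesting 2).
int count_positive_even_nested(const std::vector<int>& values) {
    int count = 0;
    for (int v : values) {
        if (v > 0) {
            if (v % 2 == 0) {
                ++count;
            }
        }
    }
    return count;
}

// Hypothetical helper: the combined predicate lives in its own function,
// where the `&&` sequence should add only +1.
bool is_positive_even(int v) {
    return v > 0 && v % 2 == 0;
}

// Flattened form. The loop body now holds a single `if`, so the expected
// increments are +1 (for) and +2 (if).
int count_positive_even_flat(const std::vector<int>& values) {
    int count = 0;
    for (int v : values) {
        if (is_positive_even(v)) {
            ++count;
        }
    }
    return count;
}
```

Both functions return the same result; only the per-function score differs. Macros such as `RETURN_IF_ERROR` expand to a `do { if (...) } while` body, which is why each use inside a nested block is charged twice in the log.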

@kaka11chen force-pushed the parquet_complex_type_late_materialization branch from 24c9cf6 to 30bf98b on November 17, 2024 18:15
@kaka11chen

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -476,7 +525,7 @@ Status ScalarColumnReader::_try_load_dict_page(bool* loaded, bool* has_dict) {
}

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,

warning: function 'read_column_data' has cognitive complexity of 81 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:529: +1, including nesting penalty of 0, nesting level increased to 1

    if (_converter == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:532: +2, including nesting penalty of 1, nesting level increased to 2

        if (!_converter->support()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:541: +1, including nesting penalty of 0, nesting level increased to 1

    do {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:542: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:543: +3, including nesting penalty of 2, nesting level increased to 3

            if (!_chunk_reader->has_next_page()) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:548: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:548: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:550: +2, including nesting penalty of 1, nesting level increased to 2

        if (_nested_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:551: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:551: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:552: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:552: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:562: +2, including nesting penalty of 1, nesting level increased to 2

        if (read_ranges.size() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:565: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:565: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:567: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:571: +3, including nesting penalty of 2, nesting level increased to 3

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:571: +1

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
                                        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:577: +4, including nesting penalty of 3, nesting level increased to 4

                if (batch_size >= remaining_num_values &&
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:577: +1

                if (batch_size >= remaining_num_values &&
                                                       ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:582: +5, including nesting penalty of 4, nesting level increased to 5

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:582: +6, including nesting penalty of 5, nesting level increased to 6

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:584: +5, including nesting penalty of 4, nesting level increased to 5

                    if (!_chunk_reader->has_next_page()) {
                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:589: +1

                skip_whole_batch = batch_size <= remaining_num_values &&
                                                                      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:591: +4, including nesting penalty of 3, nesting level increased to 4

                if (skip_whole_batch) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:596: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:596: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:621: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0 && !_chunk_reader->has_next_page()) {
        ^
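
The flagged branch at `filter_map.filter_ratio() > 0.6` only tries to skip whole pages or batches when most rows are already filtered out. A minimal sketch of that idea follows, assuming a plain byte vector where 1 keeps a row and 0 drops it. The names `filter_ratio` and `can_skip_range` below are hypothetical stand-ins, not the Doris `FilterMap` API:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 1 keeps the row, 0 filters it out.
// Returns the fraction of rows that are filtered out (0.0 for an empty map).
double filter_ratio(const std::vector<uint8_t>& filter) {
    if (filter.empty()) {
        return 0.0;
    }
    std::size_t filtered = 0;
    for (uint8_t keep : filter) {
        if (keep == 0) {
            ++filtered;
        }
    }
    return static_cast<double>(filtered) / static_cast<double>(filter.size());
}

// True when every row in [start, start + count) is filtered out, so the
// range could be skipped without decoding any values.
bool can_skip_range(const std::vector<uint8_t>& filter, std::size_t start,
                    std::size_t count) {
    if (start + count > filter.size()) {
        return false;
    }
    for (std::size_t i = start; i < start + count; ++i) {
        if (filter[i] != 0) {
            return false;
        }
    }
    return true;
}
```

Checking the ratio first keeps the range scan off the common path: when few rows are filtered, a fully filtered page is unlikely and the scan would mostly be wasted work.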

@kaka11chen force-pushed the parquet_complex_type_late_materialization branch from 30bf98b to 5ef9e17 on November 18, 2024 07:05
@kaka11chen

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -316,9 +319,15 @@ Status ScalarColumnReader::_read_values(size_t num_values, ColumnPtr& doris_colu
* whether the reader should read the remaining value of the last row in previous page.
*/
Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,

warning: function '_read_nested_column' has cognitive complexity of 77 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:331: +1, including nesting penalty of 0, nesting level increased to 1

    if (align_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:336: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:345: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_rep_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:347: +2, including nesting penalty of 1, nesting level increased to 2

        while (parsed_rows <= batch_size && remaining_values > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:347: +1

        while (parsed_rows <= batch_size && remaining_values > 0) {
                                         ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:349: +3, including nesting penalty of 2, nesting level increased to 3

            if (rep_level == 0) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:350: +4, including nesting penalty of 3, nesting level increased to 4

                if (parsed_rows == batch_size) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:360: +2, including nesting penalty of 1, nesting level increased to 2

        if (filter_map.has_filter()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:364: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:364: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:369: +1, nesting level increased to 1

    } else if (!align_rows) {
           ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:379: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_def_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:381: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:388: +1, including nesting penalty of 0, nesting level increased to 1

    if (doris_column->is_nullable()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:394: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:395: +2, including nesting penalty of 1, nesting level increased to 2

        if (_field_schema->is_nullable) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:405: +1, including nesting penalty of 0, nesting level increased to 1

    while (has_read < origin_size + parsed_values) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:408: +2, including nesting penalty of 1, nesting level increased to 2

        while (has_read < origin_size + parsed_values && _def_levels[has_read] == def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:412: +2, including nesting penalty of 1, nesting level increased to 2

        if (def_level < _field_schema->repeated_parent_def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:419: +2, including nesting penalty of 1, nesting level increased to 2

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:419: +1

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
                                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:423: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:424: +3, including nesting penalty of 2, nesting level increased to 3

            if (!(prev_is_null ^ is_null)) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:428: +3, including nesting penalty of 2, nesting level increased to 3

            while (remaining > USHRT_MAX) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:443: +1, including nesting penalty of 0, nesting level increased to 1

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:443: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:448: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_chunk_reader->decode_values(data_column, type, select_vector, is_dict_filter));
    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:448: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_chunk_reader->decode_values(data_column, type, select_vector, is_dict_filter));
    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:449: +1, including nesting penalty of 0, nesting level increased to 1

    if (ancestor_nulls != 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:450: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:450: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:453: +1, including nesting penalty of 0, nesting level increased to 1

    if (!align_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:457: +1, including nesting penalty of 0, nesting level increased to 1

    if (_chunk_reader->remaining_num_values() == 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:458: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->has_next_page()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:459: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:459: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:460: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:460: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:463: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:468: +1, including nesting penalty of 0, nesting level increased to 1

    if (current_filter_map->has_filter()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:489: +1, including nesting penalty of 0, nesting level increased to 1

    if (_rep_levels.size() > 0) {
    ^
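
The `rep_level == 0` check and the call to `generate_nested_filter_map` in the trace above reflect how a row-level filter has to be widened for nested data: one top-level row can span many leaf values, and a repetition level of 0 marks the first value of each new row. A minimal sketch of that expansion follows, assuming a plain byte vector as the filter. `expand_row_filter` is a hypothetical helper, not the signature used in the PR:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands a per-row filter into a per-value filter.
// rep_levels[i] == 0 marks the first value of a new top-level row; any
// higher level continues the current row.
std::vector<uint8_t> expand_row_filter(const std::vector<uint8_t>& row_filter,
                                       const std::vector<int16_t>& rep_levels) {
    std::vector<uint8_t> nested_filter;
    nested_filter.reserve(rep_levels.size());
    std::size_t row = 0;
    bool started = false;
    for (int16_t level : rep_levels) {
        if (level == 0) {
            if (started) {
                ++row;  // advance to the next top-level row
            }
            started = true;
        }
        // The first level must be 0 and rows must not outrun the filter.
        assert(started && row < row_filter.size());
        nested_filter.push_back(row_filter[row]);
    }
    return nested_filter;
}
```

For example, with a row filter of `{1, 0, 1}` and repetition levels `{0, 1, 1, 0, 0, 1}`, the three rows hold 3, 1, and 2 values, so the expanded filter is `{1, 1, 1, 0, 1, 1}`. This sketch ignores rows that continue across page boundaries, which the real reader has to handle.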

@@ -476,7 +526,7 @@
}

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,

warning: function 'read_column_data' has cognitive complexity of 81 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:530: +1, including nesting penalty of 0, nesting level increased to 1

    if (_converter == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:533: +2, including nesting penalty of 1, nesting level increased to 2

        if (!_converter->support()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:542: +1, including nesting penalty of 0, nesting level increased to 1

    do {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:543: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:544: +3, including nesting penalty of 2, nesting level increased to 3

            if (!_chunk_reader->has_next_page()) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:549: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:549: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:551: +2, including nesting penalty of 1, nesting level increased to 2

        if (_nested_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:552: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:552: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:553: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:553: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:563: +2, including nesting penalty of 1, nesting level increased to 2

        if (read_ranges.size() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:566: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:566: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:568: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:572: +3, including nesting penalty of 2, nesting level increased to 3

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:572: +1

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
                                        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:578: +4, including nesting penalty of 3, nesting level increased to 4

                if (batch_size >= remaining_num_values &&
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:578: +1

                if (batch_size >= remaining_num_values &&
                                                       ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:583: +5, including nesting penalty of 4, nesting level increased to 5

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:583: +6, including nesting penalty of 5, nesting level increased to 6

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:585: +5, including nesting penalty of 4, nesting level increased to 5

                    if (!_chunk_reader->has_next_page()) {
                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:590: +1

                skip_whole_batch = batch_size <= remaining_num_values &&
                                                                      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:592: +4, including nesting penalty of 3, nesting level increased to 4

                if (skip_whole_batch) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:597: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:597: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:622: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0 && !_chunk_reader->has_next_page()) {
        ^

@kaka11chen force-pushed the parquet_complex_type_late_materialization branch from 5ef9e17 to 046228a on November 18, 2024 14:34
@kaka11chen

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -316,9 +319,15 @@ Status ScalarColumnReader::_read_values(size_t num_values, ColumnPtr& doris_colu
* whether the reader should read the remaining value of the last row in previous page.
*/
Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,

warning: function '_read_nested_column' has cognitive complexity of 119 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:331: +1, including nesting penalty of 0, nesting level increased to 1

    if (align_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:336: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:345: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_rep_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:347: +2, including nesting penalty of 1, nesting level increased to 2

        while (parsed_rows <= batch_size && remaining_values > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:347: +1

        while (parsed_rows <= batch_size && remaining_values > 0) {
                                         ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:349: +3, including nesting penalty of 2, nesting level increased to 3

            if (rep_level == 0) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:350: +4, including nesting penalty of 3, nesting level increased to 4

                if (parsed_rows == batch_size) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:360: +2, including nesting penalty of 1, nesting level increased to 2

        if (filter_map.has_filter()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:364: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:364: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:369: +1, nesting level increased to 1

    } else if (!align_rows) {
           ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:379: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_def_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:381: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:388: +1, including nesting penalty of 0, nesting level increased to 1

    if (doris_column->is_nullable()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:394: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:395: +2, including nesting penalty of 1, nesting level increased to 2

        if (_field_schema->is_nullable) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:408: +1, including nesting penalty of 0, nesting level increased to 1

    while (has_read < origin_size + parsed_values) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:411: +2, including nesting penalty of 1, nesting level increased to 2

        while (has_read < origin_size + parsed_values && _def_levels[has_read] == def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:416: +2, including nesting penalty of 1, nesting level increased to 2

        if (def_level < _field_schema->repeated_parent_def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:422: +2, including nesting penalty of 1, nesting level increased to 2

        if (is_null) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:424: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:428: +2, including nesting penalty of 1, nesting level increased to 2

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:428: +1

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
                                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:430: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:431: +3, including nesting penalty of 2, nesting level increased to 3

            if (!(prev_is_null ^ is_null)) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:435: +3, including nesting penalty of 2, nesting level increased to 3

            while (remaining > USHRT_MAX) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:447: +1, including nesting penalty of 0, nesting level increased to 1

    if (current_filter_map->filter_all()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:448: +2, including nesting penalty of 1, nesting level increased to 2

        if (null_size > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:449: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(null_size, false));
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:449: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(null_size, false));
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:451: +2, including nesting penalty of 1, nesting level increased to 2

        if (nonnull_size > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:452: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(nonnull_size, true));
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:452: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(nonnull_size, true));
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:454: +2, including nesting penalty of 1, nesting level increased to 2

        if (ancestor_nulls != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:455: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:455: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:457: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:461: +2, including nesting penalty of 1, nesting level increased to 2

            RETURN_IF_ERROR(
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:461: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:466: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:466: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:468: +2, including nesting penalty of 1, nesting level increased to 2

        if (ancestor_nulls != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:469: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:469: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:473: +1, including nesting penalty of 0, nesting level increased to 1

    if (!align_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:477: +1, including nesting penalty of 0, nesting level increased to 1

    if (_chunk_reader->remaining_num_values() == 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:478: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->has_next_page()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:479: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:479: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:480: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:480: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:483: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:488: +1, including nesting penalty of 0, nesting level increased to 1

    if (current_filter_map->has_filter()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:489: +2, including nesting penalty of 1, nesting level increased to 2

        if (current_filter_map->filter_all()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:492: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:514: +1, including nesting penalty of 0, nesting level increased to 1

    if (_rep_levels.size() > 0) {
    ^
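
The trace above passes through `generate_nested_filter_map` and the `rep_level == 0` check. As a rough illustration of the idea only, and not the code in this PR, the sketch below expands a per-row filter into a per-value filter by treating a repetition level of 0 as the start of a new row. `expand_row_filter` and its parameters are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the Doris API: expand a per-row filter into a
// per-value filter for a repeated (nested) column.
// rep_levels[i] == 0 marks the first value of a new row; any other level
// means the value belongs to the row that is already open.
std::vector<uint8_t> expand_row_filter(const std::vector<uint8_t>& row_filter,
                                       const std::vector<int16_t>& rep_levels) {
    std::vector<uint8_t> value_filter;
    value_filter.reserve(rep_levels.size());
    std::ptrdiff_t row = -1;  // no row has been opened yet
    for (int16_t level : rep_levels) {
        if (level == 0) {
            ++row;  // a new row starts here
        }
        // Stop on values that continue a row from an earlier page, or that
        // run past the rows covered by the filter; this sketch ignores both.
        if (row < 0 || static_cast<size_t>(row) >= row_filter.size()) {
            break;
        }
        value_filter.push_back(row_filter[static_cast<size_t>(row)]);
    }
    return value_filter;
}
```

With a row filter of `{1, 0, 1}` and repetition levels `{0, 1, 1, 0, 0, 1}`, the first row owns three values, the second one, and the third two, so every value inherits the keep/drop flag of its row.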

@@ -476,7 +551,7 @@
}

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,

warning: function 'read_column_data' has cognitive complexity of 81 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:555: +1, including nesting penalty of 0, nesting level increased to 1

    if (_converter == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:558: +2, including nesting penalty of 1, nesting level increased to 2

        if (!_converter->support()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:567: +1, including nesting penalty of 0, nesting level increased to 1

    do {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:568: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:569: +3, including nesting penalty of 2, nesting level increased to 3

            if (!_chunk_reader->has_next_page()) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:574: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:574: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:576: +2, including nesting penalty of 1, nesting level increased to 2

        if (_nested_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:577: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:577: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:578: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:578: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:588: +2, including nesting penalty of 1, nesting level increased to 2

        if (read_ranges.size() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:591: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:591: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:593: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:597: +3, including nesting penalty of 2, nesting level increased to 3

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:597: +1

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
                                        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:603: +4, including nesting penalty of 3, nesting level increased to 4

                if (batch_size >= remaining_num_values &&
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:603: +1

                if (batch_size >= remaining_num_values &&
                                                       ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:608: +5, including nesting penalty of 4, nesting level increased to 5

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:608: +6, including nesting penalty of 5, nesting level increased to 6

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:610: +5, including nesting penalty of 4, nesting level increased to 5

                    if (!_chunk_reader->has_next_page()) {
                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:615: +1

                skip_whole_batch = batch_size <= remaining_num_values &&
                                                                      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:617: +4, including nesting penalty of 3, nesting level increased to 4

                if (skip_whole_batch) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:622: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:622: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:647: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0 && !_chunk_reader->has_next_page()) {
        ^
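
The trace above can be read as a running sum. Judging only from the numbers it prints, a nesting construct (`if`, `while`, `do`) adds 1 plus its nesting penalty, while an `else` branch or a `&&` sequence adds a flat 1. The sketch below reproduces that arithmetic; `Increment`, `cost`, and `total_complexity` are hypothetical names and this is not clang-tidy's implementation.

```cpp
#include <vector>

// Hypothetical model of one line in the clang-tidy trace.
enum class Kind {
    Nesting,  // if / while / do: pays 1 plus the nesting penalty
    Flat      // else, or a && / || sequence: pays a flat 1
};

struct Increment {
    Kind kind;
    int nesting_penalty;  // enclosing depth; ignored for Kind::Flat
};

// Cost of a single increment, matching the "+N" values in the trace.
int cost(const Increment& inc) {
    return inc.kind == Kind::Nesting ? 1 + inc.nesting_penalty : 1;
}

// Sum of all increments in a function.
int total_complexity(const std::vector<Increment>& increments) {
    int total = 0;
    for (const Increment& inc : increments) {
        total += cost(inc);
    }
    return total;
}
```

Feeding in the first five entries reported for `read_column_data` (penalties 0, 1, 0, 1, 2, all nesting constructs) gives 1 + 2 + 1 + 2 + 3 = 9, which is how a function with many shallow checks still climbs past the threshold of 50.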

@kaka11chen
Contributor Author

run buildall

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -445,7 +437,7 @@ Status RowGroupReader::_read_column_data(Block* block, const std::vector<std::st

Status RowGroupReader::_do_lazy_read(Block* block, size_t batch_size, size_t* read_rows,

warning: function '_do_lazy_read' has cognitive complexity of 92 (threshold 50) [readability-function-cognitive-complexity]

Status RowGroupReader::_do_lazy_read(Block* block, size_t batch_size, size_t* read_rows,
                       ^
Additional context

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:445: +1, including nesting penalty of 0, nesting level increased to 1

    for (uint32_t i = 0; i < origin_column_num; ++i) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:450: +1, including nesting penalty of 0, nesting level increased to 1

    while (!_state->is_cancelled()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:455: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.predicate_columns.first, batch_size,
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:455: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.predicate_columns.first, batch_size,
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:457: +2, including nesting penalty of 1, nesting level increased to 2

        if (pre_read_rows == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:462: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_fill_partition_columns(block, pre_read_rows,
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:462: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_fill_partition_columns(block, pre_read_rows,
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:464: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_fill_missing_columns(block, pre_read_rows,
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:464: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_fill_missing_columns(block, pre_read_rows,
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:467: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_build_pos_delete_filter(pre_read_rows));
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:467: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_build_pos_delete_filter(pre_read_rows));
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:470: +2, including nesting penalty of 1, nesting level increased to 2

        if (_lazy_read_ctx.resize_first_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:478: +2, including nesting penalty of 1, nesting level increased to 2

        if (_position_delete_ctx.has_filter) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:489: +2, including nesting penalty of 1, nesting level increased to 2

            RETURN_IF_ERROR(VExprContext::execute_conjuncts(filter_contexts, &filters, block,
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:489: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(VExprContext::execute_conjuncts(filter_contexts, &filters, block,
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:493: +2, including nesting penalty of 1, nesting level increased to 2

        if (_lazy_read_ctx.resize_first_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:500: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(filter_map_ptr->init(filter_map_data, pre_read_rows, can_filter_all));
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:500: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(filter_map_ptr->init(filter_map_data, pre_read_rows, can_filter_all));
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:501: +2, including nesting penalty of 1, nesting level increased to 2

        if (filter_map_ptr->filter_all()) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:514: +3, including nesting penalty of 2, nesting level increased to 3

            if (!pre_eof) {
            ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:515: +4, including nesting penalty of 3, nesting level increased to 4

                if (pre_raw_read_rows >= config::doris_scanner_row_num) {
                ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:520: +1, nesting level increased to 3

            } else { // pre_eof
              ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:528: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:532: +1, including nesting penalty of 0, nesting level increased to 1

    if (_state->is_cancelled()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:536: +1, including nesting penalty of 0, nesting level increased to 1

    if (filter_map_ptr == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:545: +1, including nesting penalty of 0, nesting level increased to 1

    if (_cached_filtered_rows != 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:546: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_rebuild_filter_map(filter_map, rebuild_filter_map, pre_read_rows));
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:546: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_rebuild_filter_map(filter_map, rebuild_filter_map, pre_read_rows));
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:554: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.lazy_read_columns, pre_read_rows,
    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:554: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.lazy_read_columns, pre_read_rows,
    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:557: +1, including nesting penalty of 0, nesting level increased to 1

    if (pre_read_rows != lazy_read_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:564: +1, including nesting penalty of 0, nesting level increased to 1

    if (filter_map.has_filter()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:565: +2, including nesting penalty of 1, nesting level increased to 2

        if (block->columns() == origin_column_num) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:569: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:570: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:79: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

    do {                                                                                         \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:570: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:84: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

        } catch (const doris::Exception& e) {                                                    \
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:570: +5, including nesting penalty of 4, nesting level increased to 5

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:85: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

            if (e.code() == doris::ErrorCode::MEM_ALLOC_FAILED) {                                \
            ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:574: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:582: +1, including nesting penalty of 0, nesting level increased to 1

    for (int i = 0; i < column_num; ++i) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:584: +2, including nesting penalty of 1, nesting level increased to 2

        if (column_size != 0 && cz != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:584: +1

        if (column_size != 0 && cz != 0) {
                             ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:587: +2, including nesting penalty of 1, nesting level increased to 2

        if (cz != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:595: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_fill_partition_columns(block, column_size, _lazy_read_ctx.partition_columns));
    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:595: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_fill_partition_columns(block, column_size, _lazy_read_ctx.partition_columns));
    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:596: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_fill_missing_columns(block, column_size, _lazy_read_ctx.missing_columns));
    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:596: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_fill_missing_columns(block, column_size, _lazy_read_ctx.missing_columns));
    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^
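
Much of the score above comes from `RETURN_IF_ERROR`: the trace attributes two increments to every use, one for the `do {` at status.h:631 and one for the `if` at status.h:633. The sketch below uses a simplified stand-in macro with the same two-level shape to show why. `Status` here is a minimal placeholder, and `RETURN_IF_ERROR_SKETCH`, `step`, `run`, and `macro_cost` are hypothetical names; the real Doris macro and status type are richer.

```cpp
#include <string>
#include <utility>

// Minimal placeholder for a status type; not doris::Status.
class Status {
public:
    static Status OK() { return Status(); }
    static Status Error(std::string msg) {
        Status s;
        s._ok = false;
        s._msg = std::move(msg);
        return s;
    }
    bool ok() const { return _ok; }
    const std::string& message() const { return _msg; }

private:
    bool _ok = true;
    std::string _msg;
};

// Simplified macro with the two-level shape the trace reports:
// an outer do { ... } while (false) and an inner if on the status.
#define RETURN_IF_ERROR_SKETCH(stmt) \
    do {                             \
        Status _status_ = (stmt);    \
        if (!_status_.ok()) {        \
            return _status_;         \
        }                            \
    } while (false)

Status step(bool fail) {
    return fail ? Status::Error("step failed") : Status::OK();
}

// Each macro use below expands to a do-loop containing an if.
Status run(bool fail_second) {
    RETURN_IF_ERROR_SKETCH(step(false));
    RETURN_IF_ERROR_SKETCH(step(fail_second));
    return Status::OK();
}

// Cost of one macro use at enclosing depth d, read off the trace:
// the do-loop pays 1 + d and the inner if pays 1 + (d + 1).
constexpr int macro_cost(int depth) {
    return (1 + depth) + (2 + depth);
}
```

At the top level of a function one use costs 3 (+1 and +2 in the trace), inside a `while` body it costs 5 (+2 and +3), and at depth 4 it costs 11 (+5 and +6), so moving groups of checked calls into helper functions at depth 0 is one way to bring the reported number down.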

@kaka11chen kaka11chen force-pushed the parquet_complex_type_late_materialization branch from 210df93 to 343c348 Compare November 21, 2024 13:38
@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -316,9 +319,12 @@ Status ScalarColumnReader::_read_values(size_t num_values, ColumnPtr& doris_colu
* whether the reader should read the remaining value of the last row in previous page.
*/
Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,

warning: function '_read_nested_column' has cognitive complexity of 124 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:328: +1, including nesting penalty of 0, nesting level increased to 1

    if (align_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:333: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:336: +2, including nesting penalty of 1, nesting level increased to 2

        if (_nested_filter_map_data) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:346: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_rep_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:349: +2, including nesting penalty of 1, nesting level increased to 2

        while (parsed_rows <= batch_size && remaining_values > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:349: +1

        while (parsed_rows <= batch_size && remaining_values > 0) {
                                         ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:351: +3, including nesting penalty of 2, nesting level increased to 3

            if (rep_level == 0) { // rep_level 0 indicates start of new row
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:352: +4, including nesting penalty of 3, nesting level increased to 4

                if (parsed_rows == batch_size) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:363: +2, including nesting penalty of 1, nesting level increased to 2

        if (filter_map.has_filter() && (!filter_map.filter_all())) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:363: +1

        if (filter_map.has_filter() && (!filter_map.filter_all())) {
                                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:364: +3, including nesting penalty of 2, nesting level increased to 3

            if (_nested_filter_map_data == nullptr) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:367: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:367: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:373: +1, nesting level increased to 1

    } else if (!align_rows) {
           ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:383: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_def_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:385: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:393: +1, including nesting penalty of 0, nesting level increased to 1

    if (doris_column->is_nullable()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:399: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:400: +2, including nesting penalty of 1, nesting level increased to 2

        if (_field_schema->is_nullable) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:414: +1, including nesting penalty of 0, nesting level increased to 1

    while (has_read < origin_size + parsed_values) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:417: +2, including nesting penalty of 1, nesting level increased to 2

        while (has_read < origin_size + parsed_values && _def_levels[has_read] == def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:422: +2, including nesting penalty of 1, nesting level increased to 2

        if (def_level < _field_schema->repeated_parent_def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:428: +2, including nesting penalty of 1, nesting level increased to 2

        if (is_null) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:430: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:434: +2, including nesting penalty of 1, nesting level increased to 2

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:434: +1

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
                                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:436: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:437: +3, including nesting penalty of 2, nesting level increased to 3

            if (!(prev_is_null ^ is_null)) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:441: +3, including nesting penalty of 2, nesting level increased to 3

            while (remaining > USHRT_MAX) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:454: +1, including nesting penalty of 0, nesting level increased to 1

    if (current_filter_map->filter_all()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:456: +2, including nesting penalty of 1, nesting level increased to 2

        if (null_size > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:457: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(null_size, false));
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:457: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(null_size, false));
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:459: +2, including nesting penalty of 1, nesting level increased to 2

        if (nonnull_size > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:460: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(nonnull_size, true));
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:460: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(nonnull_size, true));
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:462: +2, including nesting penalty of 1, nesting level increased to 2

        if (ancestor_nulls != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:463: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:463: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:465: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:469: +2, including nesting penalty of 1, nesting level increased to 2

            RETURN_IF_ERROR(
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:469: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:474: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:474: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:476: +2, including nesting penalty of 1, nesting level increased to 2

        if (ancestor_nulls != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:477: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:477: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:484: +1, including nesting penalty of 0, nesting level increased to 1

    if (_chunk_reader->remaining_num_values() == 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:485: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->has_next_page()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:486: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:486: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:487: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:487: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:490: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:496: +1, including nesting penalty of 0, nesting level increased to 1

    if (current_filter_map->has_filter()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:497: +2, including nesting penalty of 1, nesting level increased to 2

        if (current_filter_map->filter_all()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:500: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:523: +1, including nesting penalty of 0, nesting level increased to 1

    if (_rep_levels.size() > 0) {
    ^
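
The notes above show the reader treating `rep_level == 0` as the start of a new row and calling `generate_nested_filter_map` when a filter is present. A minimal sketch of that idea follows, assuming the batch begins on a row boundary. `expand_row_filter` and its parameters are hypothetical names for illustration, not the PR's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: expand a row-level filter (one flag per row) into a
// value-level filter (one flag per nested value).
// Assumption: rep_levels[i] == 0 marks the first value of a new row, and the
// first entry of rep_levels is 0 (the batch starts on a row boundary).
std::vector<uint8_t> expand_row_filter(const std::vector<uint8_t>& row_filter,
                                       const std::vector<int16_t>& rep_levels) {
    std::vector<uint8_t> nested;
    nested.reserve(rep_levels.size());
    size_t row = 0;
    bool started = false;
    for (int16_t rep : rep_levels) {
        if (rep == 0) {
            if (started) {
                ++row; // a later rep_level of 0 moves on to the next row
            }
            started = true;
        }
        assert(started && row < row_filter.size());
        nested.push_back(row_filter[row]); // every value inherits its row's flag
    }
    return nested;
}
```

With a row filter of `{1, 0, 1}` and repetition levels `{0, 1, 1, 0, 0, 1}`, the three rows hold 3, 1, and 2 values, so the second row's single value is the only one filtered out.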

@@ -476,7 +560,7 @@
}

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,

warning: function 'read_column_data' has cognitive complexity of 81 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:564: +1, including nesting penalty of 0, nesting level increased to 1

    if (_converter == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:567: +2, including nesting penalty of 1, nesting level increased to 2

        if (!_converter->support()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:576: +1, including nesting penalty of 0, nesting level increased to 1

    do {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:577: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:578: +3, including nesting penalty of 2, nesting level increased to 3

            if (!_chunk_reader->has_next_page()) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:583: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:583: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:585: +2, including nesting penalty of 1, nesting level increased to 2

        if (_nested_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:586: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:586: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:587: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:587: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:597: +2, including nesting penalty of 1, nesting level increased to 2

        if (read_ranges.size() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:600: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:600: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:602: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:606: +3, including nesting penalty of 2, nesting level increased to 3

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:606: +1

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
                                        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:612: +4, including nesting penalty of 3, nesting level increased to 4

                if (batch_size >= remaining_num_values &&
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:612: +1

                if (batch_size >= remaining_num_values &&
                                                       ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:617: +5, including nesting penalty of 4, nesting level increased to 5

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:617: +6, including nesting penalty of 5, nesting level increased to 6

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:619: +5, including nesting penalty of 4, nesting level increased to 5

                    if (!_chunk_reader->has_next_page()) {
                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:624: +1

                skip_whole_batch = batch_size <= remaining_num_values &&
                                                                      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:626: +4, including nesting penalty of 3, nesting level increased to 4

                if (skip_whole_batch) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:631: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:631: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:631: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:633: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:656: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0 && !_chunk_reader->has_next_page()) {
        ^
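
Many of the increments listed above come from the `do {` and `if (...) {` inside `RETURN_IF_ERROR`, which are counted at whatever nesting level the macro is used. The sketch below uses a simplified stand-in for the status type and macro (not the Doris definitions) and a hypothetical `FakeChunkReader` to show how an early-return helper keeps each macro use at the top level of a small function.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for a status type; not the Doris class.
class Status {
public:
    static Status OK() { return Status(); }
    static Status Error(std::string msg) {
        Status s;
        s._ok = false;
        s._msg = std::move(msg);
        return s;
    }
    bool ok() const { return _ok; }
    const std::string& msg() const { return _msg; }

private:
    bool _ok = true;
    std::string _msg;
};

// Simplified macro with the same do/if shape the clang-tidy notes point at.
#define RETURN_IF_ERROR(stmt)     \
    do {                          \
        Status _status_ = (stmt); \
        if (!_status_.ok()) {     \
            return _status_;      \
        }                         \
    } while (false)

// Hypothetical page source used only for this illustration.
struct FakeChunkReader {
    int pages_left = 0;
    bool fail_on_load = false;
    bool has_next_page() const { return pages_left > 0; }
    Status next_page() {
        --pages_left;
        return Status::OK();
    }
    Status load_page_data() {
        return fail_on_load ? Status::Error("load failed") : Status::OK();
    }
};

// Extracted helper: the early return removes the if/else nesting, so both
// macro uses sit directly in the function body.
Status advance_page(FakeChunkReader& reader, bool* eof) {
    assert(eof != nullptr);
    if (!reader.has_next_page()) {
        *eof = true;
        return Status::OK();
    }
    RETURN_IF_ERROR(reader.next_page());
    RETURN_IF_ERROR(reader.load_page_data());
    *eof = false;
    return Status::OK();
}
```

Whether such a split is worthwhile for `read_column_data` is a judgment call for the authors; the sketch only illustrates why the macro expansion inflates the reported score inside nested branches.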

// specific language governing permissions and limitations
// under the License.

#include <gtest/gtest.h>

warning: 'gtest/gtest.h' file not found [clang-diagnostic-error]

#include <gtest/gtest.h>
         ^

@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the parquet_complex_type_late_materialization branch from 343c348 to 14f5bfc Compare November 21, 2024 14:45
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the parquet_complex_type_late_materialization branch from 14f5bfc to a82a456 Compare November 21, 2024 15:32
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the parquet_complex_type_late_materialization branch from a82a456 to c635719 Compare November 21, 2024 15:48
@kaka11chen kaka11chen changed the title from "[opt](parquet-reader)Implement late materialization of parquet comple…" to "[opt](parquet-reader)Implement late materialization of parquet complex types." Nov 22, 2024
@kaka11chen kaka11chen marked this pull request as ready for review November 22, 2024 01:52
@kaka11chen
Contributor Author

run buildall

@kaka11chen kaka11chen force-pushed the parquet_complex_type_late_materialization branch from c635719 to e4bad17 Compare November 22, 2024 07:56
@kaka11chen
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.34% (9989/26051)
Line Coverage: 29.47% (83662/283885)
Region Coverage: 28.63% (43062/150383)
Branch Coverage: 25.22% (21881/86770)
Coverage Report: http://coverage.selectdb-in.cc/coverage/e4bad1797a857209941cd77dc1fef3aa5993b13e_e4bad1797a857209941cd77dc1fef3aa5993b13e/report/index.html

@kaka11chen
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.36% (9987/26036)
Line Coverage: 29.47% (83646/283857)
Region Coverage: 28.64% (43062/150372)
Branch Coverage: 25.22% (21883/86762)
Coverage Report: http://coverage.selectdb-in.cc/coverage/0eceaf9351aa67d88214db16b26eca3d0167d96f_0eceaf9351aa67d88214db16b26eca3d0167d96f/report/index.html

@kaka11chen kaka11chen force-pushed the parquet_complex_type_late_materialization branch from 6811d91 to 815442d Compare December 10, 2024 01:41
@kaka11chen
Contributor Author

run buildall

@github-actions github-actions bot removed the "approved" label (indicates a PR has been approved by one committer) Dec 10, 2024
@github-actions github-actions bot left a comment

clang-tidy made some suggestions


size_t num_filtered() const { return _num_filtered; }
Status init(const std::vector<uint16_t>& run_length_null_map, size_t num_values,

warning: function 'init' exceeds recommended size/complexity thresholds [readability-function-size]

    Status init(const std::vector<uint16_t>& run_length_null_map, size_t num_values,
           ^
Additional context

be/src/vec/exec/format/parquet/parquet_common.h:104: 98 lines including whitespace and comments (threshold 80)

    Status init(const std::vector<uint16_t>& run_length_null_map, size_t num_values,
           ^
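
`init` takes a `run_length_null_map` of `uint16_t` runs. The sketch below shows one way such a map could be built. It assumes that runs alternate between non-null and null, starting with non-null, and that a zero-length run bridges two runs of the same kind when a count would exceed `USHRT_MAX`. That layout is inferred from the conditions quoted in the `_read_nested_column` notes and is not confirmed on this page. `append_run` is a hypothetical helper.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append `count` values with nullity `is_null` to a
// run-length map. Assumed layout: even indices hold non-null run lengths,
// odd indices hold null run lengths. The caller starts with null_map == {0}
// and prev_is_null == false.
void append_run(std::vector<uint16_t>& null_map, bool& prev_is_null, bool is_null,
                size_t count) {
    assert(!null_map.empty());
    if (prev_is_null == is_null &&
        static_cast<size_t>(USHRT_MAX - null_map.back()) >= count) {
        // Same kind as the last run and it still fits: extend in place.
        null_map.back() += static_cast<uint16_t>(count);
        return;
    }
    if (prev_is_null == is_null) {
        // Same kind but the last run would overflow: insert an empty run of
        // the opposite kind so the alternation is preserved.
        null_map.push_back(0);
    }
    size_t remaining = count;
    while (remaining > USHRT_MAX) {
        null_map.push_back(USHRT_MAX); // full run of the requested kind
        null_map.push_back(0);         // empty run of the opposite kind
        remaining -= USHRT_MAX;
    }
    null_map.push_back(static_cast<uint16_t>(remaining));
    prev_is_null = is_null;
}
```

Starting from `{0}`, appending 3 non-null values, then 2 nulls, then 70000 more nulls gives `{3, 2, 0, 65535, 0, 4465}`: the null counts at odd indices still sum to 70002.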

@@ -316,9 +319,12 @@ Status ScalarColumnReader::_read_values(size_t num_values, ColumnPtr& doris_colu
* whether the reader should read the remaining value of the last row in previous page.
*/
Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,

warning: function '_read_nested_column' has cognitive complexity of 127 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::_read_nested_column(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:328: +1, including nesting penalty of 0, nesting level increased to 1

    if (align_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:333: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:336: +2, including nesting penalty of 1, nesting level increased to 2

        if (_nested_filter_map_data) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:346: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_rep_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:349: +2, including nesting penalty of 1, nesting level increased to 2

        while (parsed_rows <= batch_size && remaining_values > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:349: +1

        while (parsed_rows <= batch_size && remaining_values > 0) {
                                         ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:351: +3, including nesting penalty of 2, nesting level increased to 3

            if (rep_level == 0) { // rep_level 0 indicates start of new row
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:352: +4, including nesting penalty of 3, nesting level increased to 4

                if (parsed_rows == batch_size) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:363: +2, including nesting penalty of 1, nesting level increased to 2

        if (filter_map.has_filter() && (!filter_map.filter_all())) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:363: +1

        if (filter_map.has_filter() && (!filter_map.filter_all())) {
                                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:364: +3, including nesting penalty of 2, nesting level increased to 3

            if (_nested_filter_map_data == nullptr) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:367: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:367: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(filter_map.generate_nested_filter_map(
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:373: +1, nesting level increased to 1

    } else if (!align_rows) {
           ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:383: +1, including nesting penalty of 0, nesting level increased to 1

    if (has_def_level) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:385: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:393: +1, including nesting penalty of 0, nesting level increased to 1

    if (doris_column->is_nullable()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:399: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:400: +2, including nesting penalty of 1, nesting level increased to 2

        if (_field_schema->is_nullable) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:415: +1, including nesting penalty of 0, nesting level increased to 1

    while (has_read < origin_size + parsed_values) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:418: +2, including nesting penalty of 1, nesting level increased to 2

        while (has_read < origin_size + parsed_values && _def_levels[has_read] == def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:423: +2, including nesting penalty of 1, nesting level increased to 2

        if (def_level < _field_schema->repeated_parent_def_level) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:424: +3, including nesting penalty of 2, nesting level increased to 3

            for (size_t i = 0; i < loop_read; i++) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:432: +2, including nesting penalty of 1, nesting level increased to 2

        if (is_null) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:434: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:438: +2, including nesting penalty of 1, nesting level increased to 2

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:438: +1

        if (prev_is_null == is_null && (USHRT_MAX - null_map.back() >= loop_read)) {
                                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:440: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:441: +3, including nesting penalty of 2, nesting level increased to 3

            if (!(prev_is_null ^ is_null)) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:445: +3, including nesting penalty of 2, nesting level increased to 3

            while (remaining > USHRT_MAX) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:458: +1, including nesting penalty of 0, nesting level increased to 1

    if (current_filter_map->filter_all()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:460: +2, including nesting penalty of 1, nesting level increased to 2

        if (null_size > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:461: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(null_size, false));
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:461: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(null_size, false));
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:463: +2, including nesting penalty of 1, nesting level increased to 2

        if (nonnull_size > 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:464: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(nonnull_size, true));
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:464: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(nonnull_size, true));
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:466: +2, including nesting penalty of 1, nesting level increased to 2

        if (ancestor_nulls != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:467: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:467: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:469: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:473: +2, including nesting penalty of 1, nesting level increased to 2

            RETURN_IF_ERROR(
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:473: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:479: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:479: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(
        ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:481: +2, including nesting penalty of 1, nesting level increased to 2

        if (ancestor_nulls != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:482: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:482: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_values(ancestor_nulls, false));
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:489: +1, including nesting penalty of 0, nesting level increased to 1

    if (_chunk_reader->remaining_num_values() == 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:490: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->has_next_page()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:491: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:491: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:492: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:492: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data());
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:495: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:501: +1, including nesting penalty of 0, nesting level increased to 1

    if (current_filter_map->has_filter()) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:502: +2, including nesting penalty of 1, nesting level increased to 2

        if (current_filter_map->filter_all()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:505: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:528: +1, including nesting penalty of 0, nesting level increased to 1

    if (_rep_levels.size() > 0) {
    ^
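
One way to bring such a score down is to flatten the page-advance block that the report flags at vparquet_column_reader.cpp:489-495 into guard clauses. The sketch below is only an illustration: `Status`, `FakeChunkReader`, and `advance_page_if_exhausted` are simplified, hypothetical stand-ins rather than the Doris classes, and the behaviour of the `else` branch (setting an end-of-file flag) is an assumption, since the report does not show its body.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, minimal stand-in for doris::Status (not the real class).
struct Status {
    bool is_ok;
    std::string msg;
    static Status OK() { return Status{true, ""}; }
    static Status Error(const std::string& m) { return Status{false, m}; }
    bool ok() const { return is_ok; }
};

// Simplified macro with the same do { if (...) } shape the report shows.
#define RETURN_IF_ERROR(stmt)       \
    do {                            \
        Status _status_ = (stmt);   \
        if (!_status_.ok()) {       \
            return _status_;        \
        }                           \
    } while (false)

// Hypothetical chunk reader: each entry of pending_pages is a page's value count.
struct FakeChunkReader {
    int remaining = 0;
    std::vector<int> pending_pages;
    bool loaded = false;

    int remaining_num_values() const { return remaining; }
    bool has_next_page() const { return !pending_pages.empty(); }
    Status next_page() {
        if (pending_pages.empty()) return Status::Error("no more pages");
        remaining = pending_pages.front();
        pending_pages.erase(pending_pages.begin());
        loaded = false;
        return Status::OK();
    }
    Status load_page_data() {
        loaded = true;
        return Status::OK();
    }
};

// Guard-clause version of the nested if / if / else block.
// By the counting rules visible in the report, the nested form scores
// 1 + 2 + (3 + 4) + (3 + 4) + 1 = 18, while this form scores
// 1 + 1 + (1 + 2) + (1 + 2) = 8.
Status advance_page_if_exhausted(FakeChunkReader& reader, bool* eof) {
    assert(eof != nullptr);
    *eof = false;
    if (reader.remaining_num_values() != 0) {
        return Status::OK();  // current page still has values
    }
    if (!reader.has_next_page()) {
        *eof = true;  // assumed meaning of the original else branch
        return Status::OK();
    }
    RETURN_IF_ERROR(reader.next_page());
    RETURN_IF_ERROR(reader.load_page_data());
    return Status::OK();
}
```

Moving the block into a helper also removes its increments from the calling function entirely, because the check scores each function separately.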

@@ -476,7 +565,7 @@
}

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,

warning: function 'read_column_data' has cognitive complexity of 81 (threshold 50) [readability-function-cognitive-complexity]

Status ScalarColumnReader::read_column_data(ColumnPtr& doris_column, DataTypePtr& type,
                           ^
Additional context

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:569: +1, including nesting penalty of 0, nesting level increased to 1

    if (_converter == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:572: +2, including nesting penalty of 1, nesting level increased to 2

        if (!_converter->support()) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:581: +1, including nesting penalty of 0, nesting level increased to 1

    do {
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:582: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:583: +3, including nesting penalty of 2, nesting level increased to 3

            if (!_chunk_reader->has_next_page()) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:588: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:588: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->next_page());
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:590: +2, including nesting penalty of 1, nesting level increased to 2

        if (_nested_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:591: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:591: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:592: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:592: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_read_nested_column(resolved_column, resolved_type, filter_map,
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:602: +2, including nesting penalty of 1, nesting level increased to 2

        if (read_ranges.size() == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:605: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:605: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->skip_page());
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:607: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:611: +3, including nesting penalty of 2, nesting level increased to 3

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
            ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:611: +1

            if (filter_map.has_filter() && filter_map.filter_ratio() > 0.6) {
                                        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:617: +4, including nesting penalty of 3, nesting level increased to 4

                if (batch_size >= remaining_num_values &&
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:617: +1

                if (batch_size >= remaining_num_values &&
                                                       ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:622: +5, including nesting penalty of 4, nesting level increased to 5

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:622: +6, including nesting penalty of 5, nesting level increased to 6

                    RETURN_IF_ERROR(_chunk_reader->skip_page());
                    ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:624: +5, including nesting penalty of 4, nesting level increased to 5

                    if (!_chunk_reader->has_next_page()) {
                    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:629: +1

                skip_whole_batch = batch_size <= remaining_num_values &&
                                                                      ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:631: +4, including nesting penalty of 3, nesting level increased to 4

                if (skip_whole_batch) {
                ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:636: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:636: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_ERROR(_chunk_reader->load_page_data_idempotent());
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_column_reader.cpp:661: +2, including nesting penalty of 1, nesting level increased to 2

        if (_chunk_reader->remaining_num_values() == 0 && !_chunk_reader->has_next_page()) {
        ^
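
The increments listed above follow a simple pattern that can be written down as arithmetic. The sketch below is inferred from this report only (it is not taken from the clang-tidy source), and the function names are hypothetical: a nesting structure costs 1 plus its current nesting depth, while `else` branches and `&&` chains cost a flat 1. Because `RETURN_IF_ERROR` expands to a `do { ... if (...) ... }` pair, every use is charged twice.

```cpp
// Increment for one nesting structure (if, for, while, do, catch)
// at the given nesting depth: 1 plus the nesting penalty.
constexpr int structure_increment(int nesting_depth) {
    return 1 + nesting_depth;
}

// Flat increment for an `else` branch or an `&&` chain: no nesting penalty.
constexpr int flat_increment() {
    return 1;
}

// Cost of one RETURN_IF_ERROR at the given depth: the macro's `do` is charged
// at that depth and its inner `if` one level deeper.
constexpr int return_if_error_cost(int nesting_depth) {
    return structure_increment(nesting_depth) + structure_increment(nesting_depth + 1);
}

// Line 588 in the report: depth 2, reported as +3 and +4.
static_assert(return_if_error_cost(2) == 3 + 4, "matches the report for line 588");
// Line 622 in the report: depth 4, reported as +5 and +6.
static_assert(return_if_error_cost(4) == 5 + 6, "matches the report for line 622");
```

Under these rules a single macro call at depth d adds 2d + 3, which is why deeply nested error checks account for a large share of the reported total of 81.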

@@ -445,7 +446,7 @@ Status RowGroupReader::_read_column_data(Block* block, const std::vector<std::st

Status RowGroupReader::_do_lazy_read(Block* block, size_t batch_size, size_t* read_rows,

warning: function '_do_lazy_read' has cognitive complexity of 109 (threshold 50) [readability-function-cognitive-complexity]

Status RowGroupReader::_do_lazy_read(Block* block, size_t batch_size, size_t* read_rows,
                       ^
Additional context

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:454: +1, including nesting penalty of 0, nesting level increased to 1

    for (uint32_t i = 0; i < origin_column_num; ++i) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:459: +1, including nesting penalty of 0, nesting level increased to 1

    while (!_state->is_cancelled()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:464: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.predicate_columns.first, batch_size,
        ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:464: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.predicate_columns.first, batch_size,
        ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:466: +2, including nesting penalty of 1, nesting level increased to 2

        if (pre_read_rows == 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:471: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_fill_partition_columns(block, pre_read_rows,
        ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:471: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_fill_partition_columns(block, pre_read_rows,
        ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:473: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_fill_missing_columns(block, pre_read_rows,
        ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:473: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_fill_missing_columns(block, pre_read_rows,
        ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:476: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_build_pos_delete_filter(pre_read_rows));
        ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:476: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_build_pos_delete_filter(pre_read_rows));
        ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:479: +2, including nesting penalty of 1, nesting level increased to 2

        if (_lazy_read_ctx.resize_first_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:487: +2, including nesting penalty of 1, nesting level increased to 2

        if (_position_delete_ctx.has_filter) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:498: +2, including nesting penalty of 1, nesting level increased to 2

            RETURN_IF_ERROR(VExprContext::execute_conjuncts(filter_contexts, &filters, block,
            ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:498: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_ERROR(VExprContext::execute_conjuncts(filter_contexts, &filters, block,
            ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:502: +2, including nesting penalty of 1, nesting level increased to 2

        if (_lazy_read_ctx.resize_first_column) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:509: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(filter_map_ptr->init(filter_map_data, pre_read_rows, can_filter_all));
        ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:509: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(filter_map_ptr->init(filter_map_data, pre_read_rows, can_filter_all));
        ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:510: +2, including nesting penalty of 1, nesting level increased to 2

        if (filter_map_ptr->filter_all()) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:523: +3, including nesting penalty of 2, nesting level increased to 3

            if (!pre_eof) {
            ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:524: +4, including nesting penalty of 3, nesting level increased to 4

                if (pre_raw_read_rows >= config::doris_scanner_row_num) {
                ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:529: +1, nesting level increased to 3

            } else { // pre_eof
              ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:537: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:541: +1, including nesting penalty of 0, nesting level increased to 1

    if (_state->is_cancelled()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:545: +1, including nesting penalty of 0, nesting level increased to 1

    if (filter_map_ptr == nullptr) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:554: +1, including nesting penalty of 0, nesting level increased to 1

    if (_cached_filtered_rows != 0) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:555: +2, including nesting penalty of 1, nesting level increased to 2

        RETURN_IF_ERROR(_rebuild_filter_map(filter_map, rebuild_filter_map, pre_read_rows));
        ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:555: +3, including nesting penalty of 2, nesting level increased to 3

        RETURN_IF_ERROR(_rebuild_filter_map(filter_map, rebuild_filter_map, pre_read_rows));
        ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:563: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.lazy_read_columns, pre_read_rows,
    ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:563: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_read_column_data(block, _lazy_read_ctx.lazy_read_columns, pre_read_rows,
    ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:566: +1, including nesting penalty of 0, nesting level increased to 1

    if (pre_read_rows != lazy_read_rows) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:573: +1, including nesting penalty of 0, nesting level increased to 1

    if (filter_map.has_filter()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:574: +2, including nesting penalty of 1, nesting level increased to 2

        if (block->columns() == origin_column_num) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:578: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:579: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:79: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

    do {                                                                                         \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:579: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:84: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

        } catch (const doris::Exception& e) {                                                    \
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:579: +5, including nesting penalty of 4, nesting level increased to 5

            RETURN_IF_CATCH_EXCEPTION(Block::filter_block_internal(
            ^

be/src/common/exception.h:85: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

            if (e.code() == doris::ErrorCode::MEM_ALLOC_FAILED) {                                \
            ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:583: +1, nesting level increased to 1

    } else {
      ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:591: +1, including nesting penalty of 0, nesting level increased to 1

    for (int i = 0; i < column_num; ++i) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:593: +2, including nesting penalty of 1, nesting level increased to 2

        if (column_size != 0 && cz != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:593: +1

        if (column_size != 0 && cz != 0) {
                             ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:596: +2, including nesting penalty of 1, nesting level increased to 2

        if (cz != 0) {
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:604: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_fill_partition_columns(block, column_size, _lazy_read_ctx.partition_columns));
    ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:604: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_fill_partition_columns(block, column_size, _lazy_read_ctx.partition_columns));
    ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:605: +1, including nesting penalty of 0, nesting level increased to 1

    RETURN_IF_ERROR(_fill_missing_columns(block, column_size, _lazy_read_ctx.missing_columns));
    ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:605: +2, including nesting penalty of 1, nesting level increased to 2

    RETURN_IF_ERROR(_fill_missing_columns(block, column_size, _lazy_read_ctx.missing_columns));
    ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:606: +1, including nesting penalty of 0, nesting level increased to 1

    if (!_not_single_slot_filter_conjuncts.empty()) {
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:609: +2, including nesting penalty of 1, nesting level increased to 2

            RETURN_IF_CATCH_EXCEPTION(
            ^

be/src/common/exception.h:79: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

    do {                                                                                         \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:610: +3, including nesting penalty of 2, nesting level increased to 3

                    RETURN_IF_ERROR(VExprContext::execute_conjuncts_and_filter_block(
                    ^

be/src/common/status.h:632: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:610: +4, including nesting penalty of 3, nesting level increased to 4

                    RETURN_IF_ERROR(VExprContext::execute_conjuncts_and_filter_block(
                    ^

be/src/common/status.h:634: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:609: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_CATCH_EXCEPTION(
            ^

be/src/common/exception.h:84: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

        } catch (const doris::Exception& e) {                                                    \
          ^

be/src/vec/exec/format/parquet/vparquet_group_reader.cpp:609: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_CATCH_EXCEPTION(
            ^

be/src/common/exception.h:85: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

            if (e.code() == doris::ErrorCode::MEM_ALLOC_FAILED) {                                \
            ^

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.82% (10110/26042)
Line Coverage: 29.73% (84821/285310)
Region Coverage: 28.81% (43564/151223)
Branch Coverage: 25.37% (22141/87280)
Coverage Report: http://coverage.selectdb-in.cc/coverage/815442d2c8d6bda6afcdc583de76ab326331b3f6_815442d2c8d6bda6afcdc583de76ab326331b3f6/report/index.html

@morningman force-pushed the parquet_complex_type_late_materialization branch from 815442d to 20b44a6 on December 25, 2024 10:48
@morningman
Contributor

run buildall

@doris-robot

TPC-H: Total hot run time: 32676 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 20b44a6193e87534f75bc30bd97b3f30b9ef6677, data reload: false

------ Round 1 ----------------------------------
q1	17592	6174	6110	6110
q2	2047	293	166	166
q3	10434	1251	734	734
q4	10205	858	436	436
q5	7529	2177	1959	1959
q6	203	182	152	152
q7	882	759	631	631
q8	9239	1355	1169	1169
q9	5530	4916	4975	4916
q10	6735	2308	1865	1865
q11	473	283	260	260
q12	346	357	230	230
q13	17768	3634	2954	2954
q14	229	232	216	216
q15	563	498	500	498
q16	625	610	584	584
q17	559	853	330	330
q18	6989	6492	6452	6452
q19	1220	946	549	549
q20	310	329	188	188
q21	2852	2196	1961	1961
q22	362	336	316	316
Total cold run time: 102692 ms
Total hot run time: 32676 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6417	7065	6251	6251
q2	229	329	235	235
q3	2259	2624	2349	2349
q4	1446	1806	1355	1355
q5	4313	4764	4653	4653
q6	177	169	140	140
q7	1991	1846	1753	1753
q8	2459	2679	2589	2589
q9	6971	6815	6812	6812
q10	2945	3250	2713	2713
q11	584	527	500	500
q12	634	705	559	559
q13	3188	3650	3009	3009
q14	273	285	280	280
q15	564	509	492	492
q16	644	673	628	628
q17	1200	1695	1244	1244
q18	7112	6960	7049	6960
q19	815	1029	1095	1029
q20	1950	1956	1816	1816
q21	5432	5158	4929	4929
q22	621	601	566	566
Total cold run time: 52224 ms
Total hot run time: 50862 ms
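
The two totals can be reproduced from the per-query rows. Judging from the numbers in both rounds, each row holds three timed runs followed by the faster of the second and third run; the cold total is the sum of the first column and the hot total is the sum of the last column. The sketch below encodes that reading. The column interpretation is an inference from the data, and `QueryRuns`, `best_hot`, and `summarize` are hypothetical names, not part of tools/tpch-tools.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One table row: query name and the three timed runs in milliseconds.
struct QueryRuns {
    std::string name;
    long cold;  // first run
    long hot1;  // second run
    long hot2;  // third run
};

// Best hot time for one query: the faster of the two warm runs,
// which is what the table's last column appears to hold.
long best_hot(const QueryRuns& r) {
    return std::min(r.hot1, r.hot2);
}

struct Totals {
    long cold;
    long hot;
};

// Sum the cold runs and the best hot runs over all queries.
Totals summarize(const std::vector<QueryRuns>& rows) {
    Totals t{0, 0};
    for (const QueryRuns& r : rows) {
        assert(r.cold >= 0 && r.hot1 >= 0 && r.hot2 >= 0);  // timings are durations
        t.cold += r.cold;
        t.hot += best_hot(r);
    }
    return t;
}
```

Feeding all 22 Round 1 rows into `summarize` should give the reported 102692 ms cold and 32676 ms hot. Note that q19 in Round 2 has a first run (815) faster than both later runs, yet its last column is 1029, so the first run is evidently excluded from the hot figure.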

@doris-robot

TPC-DS: Total hot run time: 190737 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 20b44a6193e87534f75bc30bd97b3f30b9ef6677, data reload: false

query1	975	391	380	380
query2	6543	2421	2280	2280
query3	6703	220	212	212
query4	33730	24233	23637	23637
query5	5095	633	478	478
query6	294	206	198	198
query7	4637	497	310	310
query8	307	257	243	243
query9	9698	2781	2762	2762
query10	474	341	253	253
query11	18427	15370	15324	15324
query12	167	108	110	108
query13	1684	561	413	413
query14	12309	6883	7377	6883
query15	238	197	195	195
query16	7896	574	438	438
query17	1535	749	559	559
query18	2065	432	293	293
query19	219	176	151	151
query20	114	110	109	109
query21	210	121	104	104
query22	4381	4485	4366	4366
query23	34151	33471	33375	33375
query24	6002	2297	2284	2284
query25	466	433	369	369
query26	1201	264	152	152
query27	2098	460	330	330
query28	5346	2470	2458	2458
query29	672	560	411	411
query30	229	179	151	151
query31	1006	912	825	825
query32	80	60	59	59
query33	543	342	293	293
query34	758	835	523	523
query35	791	802	765	765
query36	1006	1060	979	979
query37	118	95	84	84
query38	4128	4073	4107	4073
query39	1575	1426	1497	1426
query40	211	115	99	99
query41	47	44	47	44
query42	120	104	104	104
query43	518	529	491	491
query44	1325	817	810	810
query45	181	169	166	166
query46	863	1031	654	654
query47	1862	1900	1830	1830
query48	385	404	328	328
query49	786	466	382	382
query50	616	651	391	391
query51	7307	7195	7204	7195
query52	100	100	90	90
query53	231	256	183	183
query54	484	493	404	404
query55	80	80	81	80
query56	249	250	234	234
query57	1163	1191	1100	1100
query58	236	218	223	218
query59	2969	3124	3055	3055
query60	269	259	238	238
query61	112	115	111	111
query62	888	784	739	739
query63	228	194	198	194
query64	4236	978	686	686
query65	3267	3177	3250	3177
query66	1050	410	306	306
query67	15838	15705	15490	15490
query68	8558	767	548	548
query69	470	290	258	258
query70	1265	1166	1149	1149
query71	453	289	270	270
query72	5828	3779	3831	3779
query73	662	756	366	366
query74	9833	9359	9003	9003
query75	4703	3089	2653	2653
query76	4275	1181	798	798
query77	821	360	280	280
query78	11028	10225	9353	9353
query79	3691	813	606	606
query80	789	547	436	436
query81	464	266	235	235
query82	624	146	123	123
query83	191	166	152	152
query84	282	91	125	91
query85	786	378	304	304
query86	356	329	320	320
query87	4655	4631	4466	4466
query88	4401	2236	2245	2236
query89	419	324	288	288
query90	1930	189	192	189
query91	138	150	101	101
query92	72	56	54	54
query93	1005	715	546	546
query94	664	384	289	289
query95	339	264	251	251
query96	480	601	277	277
query97	2766	2786	2625	2625
query98	233	209	197	197
query99	1718	1596	1442	1442
Total cold run time: 297391 ms
Total hot run time: 190737 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.83% (10103/26020)
Line Coverage: 29.83% (85333/286048)
Region Coverage: 28.98% (43613/150517)
Branch Coverage: 25.51% (22239/87180)
Coverage Report: http://coverage.selectdb-in.cc/coverage/20b44a6193e87534f75bc30bd97b3f30b9ef6677_20b44a6193e87534f75bc30bd97b3f30b9ef6677/report/index.html

@doris-robot

ClickBench: Total hot run time: 31.58 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 20b44a6193e87534f75bc30bd97b3f30b9ef6677, data reload: false

query1	0.04	0.04	0.03
query2	0.06	0.03	0.04
query3	0.23	0.07	0.08
query4	1.59	0.11	0.10
query5	0.41	0.40	0.40
query6	1.16	0.66	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.04
query9	0.58	0.52	0.51
query10	0.55	0.58	0.55
query11	0.14	0.11	0.10
query12	0.14	0.11	0.12
query13	0.61	0.60	0.59
query14	2.70	2.79	2.84
query15	0.89	0.82	0.82
query16	0.38	0.39	0.38
query17	1.03	1.06	1.01
query18	0.23	0.21	0.21
query19	1.98	1.86	2.03
query20	0.02	0.01	0.02
query21	15.39	0.93	0.60
query22	0.75	0.92	0.58
query23	15.27	1.40	0.57
query24	2.96	1.24	1.94
query25	0.18	0.12	0.13
query26	0.24	0.15	0.13
query27	0.05	0.07	0.06
query28	14.49	1.50	1.05
query29	12.55	3.95	3.26
query30	0.25	0.10	0.06
query31	2.82	0.59	0.38
query32	3.23	0.55	0.45
query33	3.05	3.12	3.08
query34	16.80	5.07	4.51
query35	4.45	4.45	4.50
query36	0.66	0.50	0.47
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.12
query41	0.08	0.03	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 106.42 s
Total hot run time: 31.58 s
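
The two totals above can be reproduced from the table if each row is read as `query, cold run, hot run 1, hot run 2`: the cold total is the sum of the first timing column, and the hot total is the sum of the per-query minimum of the two hot runs. That column reading is an inference from the numbers, not something the bot states, and `clickbench_totals` is a hypothetical helper name. A minimal sketch in Python, using only the first five rows as sample input:

```python
def clickbench_totals(rows_text):
    """Return (cold_total, hot_total) in seconds for ClickBench-style rows.

    Assumed row layout (inferred, not documented by the bot):
        <query name> <cold run> <hot run 1> <hot run 2>
    """
    cold_total = 0.0
    hot_total = 0.0
    for line in rows_text.strip().splitlines():
        _name, cold, hot1, hot2 = line.split()
        cold_total += float(cold)
        # The hot figure for a query is taken as the faster of the two hot runs.
        hot_total += min(float(hot1), float(hot2))
    return round(cold_total, 2), round(hot_total, 2)


# First five rows of the table above, copied as sample input.
SAMPLE_ROWS = """
query1 0.04 0.04 0.03
query2 0.06 0.03 0.04
query3 0.23 0.07 0.08
query4 1.59 0.11 0.10
query5 0.41 0.40 0.40
"""

print(clickbench_totals(SAMPLE_ROWS))  # (2.33, 0.63)
```

Feeding all 43 rows through the same function gives 106.42 s cold and 31.58 s hot, matching the reported totals.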

@kaka11chen
Contributor Author

run external

Contributor

@morningman left a comment

LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) on Dec 26, 2024
@morningman merged commit 3cc11d5 into apache:master on Dec 26, 2024
23 of 25 checks passed
github-actions bot pushed a commit that referenced this pull request Dec 26, 2024
…x types. (#44098)

### What problem does this PR solve?

Problem Summary:
Late materialization is not supported when querying fields with complex
types.

### Release note
[opt](parquet-reader)Implement late materialization of parquet complex types.
morningman pushed a commit that referenced this pull request Dec 31, 2024
…rquet complex types. #44098 (#45985)

Cherry-picked from #44098

Co-authored-by: Qi Chen <[email protected]>
Labels
approved (Indicates a PR has been approved by one committer.), dev/2.1.x-experimental, dev/3.0.4-merged, reviewed

4 participants